Enhancing Optical Character Recognition on Images with Mixed Text Using Semantic Segmentation

Authors: Deepak Dharrao, Ketan Kotecha, Lakhan Maheshwari, Pooja Kamat, Rohan Athawade, Shrushti Kumbhare, Shruti Patil, Supriya Mahadevkar, Vijayakumar Varadarajan, Yash Garg
Publication venue: MDPI AG
Publication date: 01/10/2022

Optical Character Recognition (OCR) has made large strides in recognizing printed and properly formatted text. However, little effort has gone into developing systems that can reliably apply OCR to printed and handwritten text simultaneously, as in hand-filled forms. Because machine-printed or typed text follows specific formats and fonts while handwritten text is variable and non-uniform, such mixed text is very hard to classify and recognize using traditional OCR alone. This paper proposes a pre-processing methodology that employs semantic segmentation to identify, segment, and crop boxes containing relevant text in a given image in order to improve the results of conventional, online-available OCR engines. The authors also compare popular OCR engines such as Microsoft Cognitive Services, Google Cloud Vision, and AWS Rekognition. The proposed pixel-wise classification technique identifies the areas of an image that contain relevant text so that they can be fed to a conventional OCR engine, with the aim of improving the quality of the output. The proposed methodology also supports the digitization of mixed typed text documents with improved performance. The experimental study shows that the proposed pipeline provides reliable, high-quality inputs to conventional OCR through complex image preprocessing, which results in better accuracy and improved performance.
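
The crop step described in the abstract can be illustrated with a minimal sketch. It assumes a segmentation model has already produced a binary mask (1 for pixels classified as relevant text, 0 otherwise); the abstract does not specify the model or the post-processing, so the function names `text_boxes` and `crop`, the 4-connectivity rule, and the `min_area` threshold are hypothetical choices made for illustration, not the authors' implementation.

```python
from collections import deque


def text_boxes(mask, min_area=4):
    """Return inclusive bounding boxes (top, left, bottom, right) of
    4-connected regions of truthy pixels in a 2D mask (list of lists).

    Regions smaller than min_area pixels are treated as noise and dropped.
    Both the connectivity rule and the threshold are assumptions.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood-fill one connected region while tracking its extent.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right, area = r, c, r, c, 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                boxes.append((top, left, bottom, right))
    return boxes


def crop(image, box):
    """Cut the boxed region out of a 2D image stored as a list of rows."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```

Each cropped region would then be passed to a conventional OCR engine in place of the full page; that call is omitted here because it depends on the chosen service.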

Directory of Open Access Journals